Bounds on the Rate of 2-D Bit-Stuffing Encoders* 



Ido Tal Ron M. Roth 
Computer Science Department 
Technion, Haifa 32000, Israel. 
Email: {idotal, ronny}@cs.technion.ac.il



Abstract — A method for bounding the rate of bit-stuffing 
encoders for 2-D constraints is presented. Instead of considering 
the original encoder, we consider a related one which is quasi- 
stationary. We use the quasi-stationary property in order to 
formulate linear requirements that must hold on the probabilities 
of the constrained arrays that are generated by the encoder. 
These requirements are used as part of a linear program. The 
minimum and maximum of the linear program bound the rate 
of the encoder from below and from above, respectively. 

A lower bound on the rate of an encoder is also a lower 
bound on the capacity of the corresponding constraint. For some 
constraints, our results lead to tighter lower bounds than what 
was previously known. 

I. Introduction 

Two-dimensional (2-D) constraints are formally defined in
[1]. Consider a 2-D constraint $\mathbb{S}$ defined over some finite
alphabet $\Sigma$. Informally, a bit-stuffing encoder for $\mathbb{S}$ operates
as follows. We encode information to an $M \times N$ rectangular
array; namely, we produce an array $a \in \mathbb{S} \cap \Sigma^{M \times N}$. We
first initialize the "boundaries" of the array (formally defined
later) according to some fixed probability distribution. Then,
we write to the "interior" of the array in raster fashion: row-
by-row. The symbol currently written is the result of a coin
toss. The probability distribution of the coin is a function
of neighboring symbols, which have already been written.
However, the "coins" used are in fact (invertible) probability
transformers, the input of which is the information we wish
to encode. Thus, information can be encoded, and decoded.

A bit-stuffing encoder is "variable-rate". The bit-stuffing
technique was initially devised for encoding one-dimensional
(1-D) constraints [2]. In [3] and [4], bit-stuffing encoders for
specific 2-D constraints were presented and analyzed. In [5],
a slightly different definition of bit-stuffing was used to give
lower bounds on the capacity of specific 2-D constraints.

In this work, we derive upper and lower bounds on the rate
of a general bit-stuffing encoder. A lower bound on the rate
of an encoder is also a lower bound on the capacity of the
corresponding constraint:

    $$ \mathrm{cap}(\mathbb{S}) = \lim_{M,N \to \infty} \frac{1}{M \cdot N} \log_2 \left| \mathbb{S} \cap \Sigma^{M \times N} \right| . $$

For some constraints, our results lead to tighter lower bounds
on capacity than what was previously known.
Fig. 1. Binary array satisfying the kings constraint. If we flip any one (or
more) of the highlighted "0" bits to "1", then the resulting array will not
satisfy the kings constraint.



Fix some 2-D constraint $\mathbb{S}$ over an alphabet $\Sigma$. As a running
example, consider the kings constraint $\mathbb{S}_{\mathrm{sq}}$, defined over the
binary alphabet $\Sigma_{\mathrm{sq}} = \{0, 1\}$ (see Figure 1). A binary array
satisfies the kings constraint if each entry set to "1" has all
of its eight neighbors set to "0". Namely, two entries equal
to "1" may not appear consecutively along a row, column, or
diagonal.
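The definition of the kings constraint translates directly into a membership test. The following is a minimal sketch; the function name `satisfies_kings` and the list-of-lists array representation are illustrative choices, not notation from the paper.

```python
from typing import List


def satisfies_kings(array: List[List[int]]) -> bool:
    """Return True if no two 1s are adjacent along a row, column, or diagonal."""
    rows = len(array)
    cols = len(array[0]) if rows else 0
    for i in range(rows):
        for j in range(cols):
            if array[i][j] != 1:
                continue
            # Inspect the (up to) eight neighbors of entry (i, j).
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols and array[ni][nj] == 1:
                        return False
    return True


# Two small illustrative arrays (not taken from Fig. 1).
good = [[1, 0, 1],
        [0, 0, 0],
        [1, 0, 1]]
bad = [[1, 0, 0],
       [0, 1, 0],   # diagonal neighbor of the 1 at (0, 0)
       [0, 0, 0]]
print(satisfies_kings(good), satisfies_kings(bad))
```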

The rest of this paper is organized as follows. In Sections
II and III we define our notation and our model of a bit-
stuffing encoder, respectively. In Section IV we define the
concept of quasi-stationarity. We also prove that, w.l.o.g., we
may assume that our encoder is quasi-stationary. In Section
V we take advantage of the quasi-stationary property and
define a linear program. The minimum (maximum) of the
linear program bounds the rate of our encoder from below
(above). Finally, Section VI states a generic lower bound on
capacity, and contains examples where this bound improves
on previous results.

We note at this point that although this work deals with 2-D 
constraints, our method can be easily generalized to higher 
dimensions as well. 

II. Notation 
We first set up some notation. 

Parallelogram and rectangle: For $M, N > 0$ and $t \ge 0$,
denote

    $$ B^{(t)}_{M,N} = \{ (i,j) : 0 \le i < M , \; 0 \le t \cdot i + j < N \} . $$

Also, for $t = 0$, denote

    $$ B_{M,N} = B^{(0)}_{M,N} . $$
Configuration: Let $a = (a_{i,j})_{(i,j) \in U}$ be a 2-D configuration
over $\Sigma$. Namely, the index set satisfies $U \subseteq \mathbb{Z}^2$, and for all
$(i,j) \in U$ we have that $a_{i,j} \in \Sigma$.
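Shifts: For integers $\alpha, \beta$ we denote the shifted index set as

    $$ \sigma_{\alpha,\beta}(U) = \{ (i + \alpha, j + \beta) : (i,j) \in U \} . $$

Also, by abuse of notation, let $\sigma_{\alpha,\beta}(a)$ be the shifted config-
uration (with index set $\sigma_{\alpha,\beta}(U)$):

    $$ \sigma_{\alpha,\beta}(a)_{i+\alpha,\,j+\beta} = a_{i,j} . $$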
Restriction of configuration: For an index set $\Psi \subseteq U$, denote
the restriction of $a$ to $\Psi$ by $a[\Psi] = (a[\Psi]_{i,j})_{(i,j) \in \Psi}$. Namely,

    $$ a[\Psi]_{i,j} = a_{i,j} , \quad \text{where } (i,j) \in \Psi . $$



Shift and restrict: Let $\tau_{\alpha,\beta}(a, \Psi)$ be shorthand for

    $$ \tau_{\alpha,\beta}(a, \Psi) = (\sigma_{-\alpha,-\beta}(a))[\Psi] . $$

Namely, shift the configuration $a$ such that index $(\alpha, \beta)$ is now
index $(0, 0)$, and then restrict to $\Psi$.
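Boundary: Denote by $\partial(U, \Psi)$ the set of all the indexes
$(\alpha, \beta) \in U$ for which the "shift and restrict" operation is
invalid:

    $$ \partial(U, \Psi) = \{ (\alpha, \beta) \in U : \sigma_{\alpha,\beta}(\Psi) \not\subseteq U \} . $$

The index set $\partial(U, \Psi)$ is termed the "boundary", and the
"interior" is

    $$ \bar{\partial}(U, \Psi) = U \setminus \partial(U, \Psi) . $$

When $U = B_{M,N}$ and $\Psi$ is understood from the context, we
abbreviate

    $$ \partial_{M,N} = \partial(B_{M,N}, \Psi) , \quad \bar{\partial}_{M,N} = \bar{\partial}(B_{M,N}, \Psi) . $$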

Figure 2 shows an example of such sets, where

    $$ \Psi = \{ (0,-2), (0,-1), (-1,-1), (-1,0), (-1,1) \} . $$ (1)
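The boundary and interior are easy to enumerate for a concrete index set. The sketch below does so for the rectangle with $M = 5$, $N = 8$ and the neighbor set in (1), the parameters used in Fig. 2; the helper names `rectangle`, `shift`, `boundary`, and `interior` are illustrative, and index sets are represented as Python sets of `(row, column)` pairs.

```python
from typing import Set, Tuple

Index = Tuple[int, int]

# Neighbor set from (1).
PSI: Set[Index] = {(0, -2), (0, -1), (-1, -1), (-1, 0), (-1, 1)}


def rectangle(m: int, n: int) -> Set[Index]:
    """The rectangular index set {(i, j) : 0 <= i < m, 0 <= j < n}."""
    return {(i, j) for i in range(m) for j in range(n)}


def shift(index_set: Set[Index], alpha: int, beta: int) -> Set[Index]:
    """Shift every index by (alpha, beta)."""
    return {(i + alpha, j + beta) for (i, j) in index_set}


def boundary(u: Set[Index], psi: Set[Index]) -> Set[Index]:
    """Indexes of u at which the shifted neighbor set is not contained in u."""
    return {(a, b) for (a, b) in u if not shift(psi, a, b) <= u}


def interior(u: Set[Index], psi: Set[Index]) -> Set[Index]:
    """Complement of the boundary inside u."""
    return u - boundary(u, psi)


B = rectangle(5, 8)
print(len(B), len(boundary(B, PSI)), len(interior(B, PSI)))  # prints: 40 20 20
```

With this neighbor set, the interior consists of rows 1 to 4 and columns 2 to 6: the first row, the two leftmost columns, and the rightmost column form the boundary.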
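Restriction of constraint: Denote the restriction of $\mathbb{S}$ to $U$ by

    $$ \mathbb{S}[U] = \{ a : \text{there exists } a' \in \mathbb{S} \text{ such that } a'[U] = a \} . $$

If $U = B_{M,N}$, then we abbreviate

    $$ \mathbb{S}_{M,N} = \mathbb{S}[B_{M,N}] . $$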
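Lexicographic ordering: We define a lexicographic ordering
$\prec$ on $\mathbb{Z}^2$ as

    $$ (i', j') \prec (i, j) \iff (i' < i) \text{ or } (i' = i \text{ and } j' < j) . $$

Also, we define the index set

    $$ T_{i,j} = \{ (i', j') : (i', j') \prec (i, j) \} . $$ (2)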

III. Bit stuffer definitions 

In this section, we present the formal definition of bit-
stuffing encoders. A bit-stuffing encoder for $\mathbb{S}$ is defined
through a triple

    $$ \mathcal{E} = (\Psi, \mu, \delta = (\delta_{M,N})_{M,N > 0}) . $$

The set

    $$ \Psi \subseteq T_{0,0} $$ (3)

is termed the neighbor set. The conditional probability func-
tion $\mu$,

    $$ \mu : \Sigma \times \mathbb{S}[\Psi] \to [0, 1] , $$

is a conditional probability distribution on $\Sigma$, given an element
of $\mathbb{S}[\Psi]$. For $M, N > 0$, the boundary probability function

    $$ \delta_{M,N} : \mathbb{S}[\partial_{M,N}] \to [0, 1] $$

is a probability distribution on $\mathbb{S}[\partial_{M,N}]$. From here onward,
we fix $\mathcal{E}$.

Fig. 2. The index (0, 0) is represented by •. We take $\Psi$ as in (1), and it is
represented by the diagonally striped cells. We set $M = 5$ and $N = 8$. The
index set $B_{M,N}$ is represented by the shaded part (both light and dark). The
boundary $\partial_{M,N}$ is represented by the lighter shaded part, while the interior
$\bar{\partial}_{M,N}$ is represented by the darker shaded part.

Fig. 3. The two non-trivial configurations $\varphi^{(0)}$ and $\varphi^{(1)}$ in our running example,
where • designates coordinate (0, 0).

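For our running example, let the neighbor set $\Psi_{\mathrm{sq}} = \Psi$ be
as in (1), and define $\varphi^{(0)}, \varphi^{(1)} \in \mathbb{S}_{\mathrm{sq}}[\Psi]$ as

    $$ \varphi^{(0)}_{0,-2} = 0 , \quad \varphi^{(0)}_{0,-1} = 0 , \quad \varphi^{(0)}_{-1,-1} = 0 , \quad \varphi^{(0)}_{-1,0} = 0 , \quad \varphi^{(0)}_{-1,1} = 0 , $$

    $$ \varphi^{(1)}_{0,-2} = 1 , \quad \varphi^{(1)}_{0,-1} = 0 , \quad \varphi^{(1)}_{-1,-1} = 0 , \quad \varphi^{(1)}_{-1,0} = 0 , \quad \varphi^{(1)}_{-1,1} = 0 $$

(see Figure 3). Also, take the conditional probability function
as

    $$ \mu_{\mathrm{sq}}(1 \mid \varphi) = 1 - \mu_{\mathrm{sq}}(0 \mid \varphi) =
       \begin{cases}
         0.258132 & \varphi = \varphi^{(0)} \\
         0.312231 & \varphi = \varphi^{(1)} \\
         0 & \text{otherwise} .
       \end{cases} $$ (4)

Thus, $\mu_{\mathrm{sq}}(\cdot \mid \cdot)$ can be implemented using two coins (one for the
context $\varphi^{(0)}$ and one for $\varphi^{(1)}$). For our running example, we
take $\delta_{M,N}$ as the function equal to 1 for the all-zero boundary
$(0)_{(i,j) \in \partial_{M,N}}$, and 0 for all other members of $\mathbb{S}_{\mathrm{sq}}[\partial_{M,N}]$.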
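The running example can be simulated with a few lines of code. The sketch below samples an array from the measure induced by the two coins in (4) with an all-zero boundary. Two assumptions are made: the context $\varphi^{(1)}$ is taken to differ from the all-zero context $\varphi^{(0)}$ only at index $(0,-2)$, and ordinary pseudo-random coin tosses stand in for the invertible probability transformers, so the code samples arrays but does not encode any data. The names `generate`, `PSI`, and `P_ONE` are illustrative.

```python
import random
from typing import List

# Neighbor set from (1), in a fixed order used to read off the context.
PSI = [(0, -2), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

# Probability of writing "1" for the two non-trivial contexts, from (4).
# Assumption: phi^(1) differs from phi^(0) only at index (0, -2).
P_ONE = {
    (0, 0, 0, 0, 0): 0.258132,  # context phi^(0)
    (1, 0, 0, 0, 0): 0.312231,  # context phi^(1)
}


def generate(m: int, n: int, rng: random.Random) -> List[List[int]]:
    """Sample an m x n array for the kings constraint in raster order."""
    # All-zero boundary: first row, two leftmost columns, rightmost column.
    a = [[0] * n for _ in range(m)]
    for i in range(1, m):              # interior rows
        for j in range(2, n - 1):      # interior columns
            context = tuple(a[i + di][j + dj] for (di, dj) in PSI)
            p = P_ONE.get(context, 0.0)  # all other contexts force a "0"
            a[i][j] = 1 if rng.random() < p else 0
    return a


for row in generate(6, 10, random.Random(1)):
    print("".join(str(bit) for bit in row))
```

Any context in which one of the four already-written king neighbors equals 1 forces a 0, which is why the output always satisfies the constraint.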

Given integers $M, N > 0$, the bit-stuffing encoder $\mathcal{E}$ defines
a probability measure on the configurations $a = (a_{i,j})_{(i,j) \in B_{M,N}}$
over $\Sigma$ with index set $B_{M,N}$, in the following manner. As a first step, we set the
boundary $a[\partial_{M,N}]$, according to the probability distribution
$\delta_{M,N}$. Next, we write the contents of the interior of $a$ in
raster fashion: row-by-row, from left to right. The probability
of writing $w \in \Sigma$ in entry $(i,j) \in \bar{\partial}_{M,N}$ is given by

    $$ \mathrm{Prob}(a_{i,j} = w) = \mu(w \mid \tau_{i,j}(a, \Psi)) . $$

Specifically, note that when writing entry $(i,j)$ we have by
(3) that $\tau_{i,j}(a, \Psi)$ is a function of entries of $a$ which have already
been written. A fundamental requirement for $\Psi$ and $\mu$ is that
for every $M$, $N$, and $\delta_{M,N}$, the support of the probability
measure thus defined is contained in $\mathbb{S}_{M,N}$.
Let

    $$ A(\mathcal{E}, M, N) = A = (A_{i,j})_{(i,j) \in B_{M,N}} $$

be a random variable taking values on $\mathbb{S}_{M,N}$ according to the
measure we have just defined. Namely,

    $$ \mathrm{Prob}(A = a) = \delta_{M,N}(a[\partial_{M,N}]) \cdot \prod_{(i,j) \in \bar{\partial}_{M,N}} \mu(a_{i,j} \mid \tau_{i,j}(a, \Psi)) . $$ (5)

We now explain how $\mathcal{E}$ is used to actually encode infor-
mation. The "coin tosses" corresponding to the invocations
of $\mu$ are, in effect, a function of the information we wish to
encode. Specifically, the values of the tosses are the output
of distribution transformers on the input stream (the mapping
from the input stream to the sequence of coin tosses is one-to-
one) [4]. Thus, we may encode information, and also decode
it. So, we define the rate of our encoder as

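    $$ R(\mathcal{E}) = \liminf_{M,N \to \infty} \frac{H(A[\bar{\partial}_{M,N}] \mid A[\partial_{M,N}])}{M \cdot N} , $$

where

    $$ A = A(\mathcal{E}, M, N) . $$

Note that since

    $$ \lim_{M,N \to \infty} \frac{|\bar{\partial}_{M,N}|}{M \cdot N} = 1 , $$

we also have that

    $$ R(\mathcal{E}) = \liminf_{M,N \to \infty} \frac{H(A(\mathcal{E}, M, N))}{M \cdot N} . $$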
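The equivalence of the two expressions for the rate can be spelled out with the chain rule for entropy. The following short derivation is a sketch of that step; it uses only the definitions above and the fact that $|\partial_{M,N}| + |\bar{\partial}_{M,N}| = M \cdot N$.

```latex
\begin{align*}
H(A) &= H(A[\partial_{M,N}])
        + H(A[\bar{\partial}_{M,N}] \mid A[\partial_{M,N}]) , \\
0 \le H(A[\partial_{M,N}])
     &\le |\partial_{M,N}| \cdot \log_2 |\Sigma|
      = \left( M \cdot N - |\bar{\partial}_{M,N}| \right) \log_2 |\Sigma| .
\end{align*}
```

After dividing by $M \cdot N$, the boundary term tends to zero as $M, N \to \infty$, so the normalized entropy of $A$ and the normalized conditional entropy of its interior have the same limit inferior.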



IV. Quasi-stationarity

Fix $k > 0$. Define the random variable

    $$ A^{(k)}(\mathcal{E}, M, N) = A^{(k)} $$

taking values on $\mathbb{S}_{M,N}$ as follows. For $w \in \mathbb{S}_{M,N}$, we have

    $$ \mathrm{Prob}(A^{(k)} = w) = \frac{1}{k^2} \sum_{0 \le i, j < k} \mathrm{Prob}\left( \sigma_{-i,-j}(A'[\sigma_{i,j}(B_{M,N})]) = w \right) , $$

where

    $$ A' = A(\mathcal{E}, M + k - 1, N + k - 1) . $$

Namely, given $A'$, we randomly and uniformly pick an $M \times N$
sub-configuration of it, and shift accordingly. The usefulness
of $A^{(k)}$ is that it is "quasi-stationary" [3, §6].
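The construction of $A^{(k)}$ amounts to sampling a slightly larger array and cropping a uniformly chosen $M \times N$ window. A minimal sketch follows; `sample_a` is a hypothetical stand-in for any sampler of $A(\mathcal{E}, \cdot, \cdot)$, and the deterministic `numbered_array` is a placeholder used only to make the cropping visible (it is not a bit-stuffing encoder).

```python
import random
from typing import Callable, List

Array = List[List[int]]
Sampler = Callable[[int, int, random.Random], Array]


def sample_quasi_stationary(sample_a: Sampler, m: int, n: int, k: int,
                            rng: random.Random) -> Array:
    """Draw A' of size (m+k-1) x (n+k-1), then crop a uniform m x n window."""
    big = sample_a(m + k - 1, n + k - 1, rng)   # A' = A(E, M+k-1, N+k-1)
    i0 = rng.randrange(k)                       # uniform row offset in {0, ..., k-1}
    j0 = rng.randrange(k)                       # uniform column offset in {0, ..., k-1}
    # Cropping re-indexes the window to start at (0, 0), i.e. it applies the shift.
    return [row[j0:j0 + n] for row in big[i0:i0 + m]]


def numbered_array(rows: int, cols: int, rng: random.Random) -> Array:
    """Placeholder sampler: entry (i, j) holds i * cols + j."""
    return [[i * cols + j for j in range(cols)] for i in range(rows)]


window = sample_quasi_stationary(numbered_array, 3, 4, 5, random.Random(0))
for row in window:
    print(row)
```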

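Lemma 1 ([3, Proposition 6.1]): Let $\mathcal{E}$, $M$, $N$, and $k$ be
given. Let $U \subseteq B_{M,N}$ be an index set, and let $w \in \mathbb{S}[U]$
be given. Suppose that for given integers $\alpha, \beta$ we have that
$\sigma_{\alpha,\beta}(U) \subseteq B_{M,N}$. Denote $A^{(k)} = A^{(k)}(\mathcal{E}, M, N)$. Then,

    $$ \left| \mathrm{Prob}(A^{(k)}[U] = w) - \mathrm{Prob}(A^{(k)}[\sigma_{\alpha,\beta}(U)] = \sigma_{\alpha,\beta}(w)) \right| \le \frac{|\alpha| + |\beta|}{k} . $$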



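Next, we show that $A^{(k)}$ is a random variable correspond-
ing to an encoder very similar to $\mathcal{E}$. First, define $\delta^{(k)} =
(\delta^{(k)}_{M,N})_{M,N > 0}$, where

    $$ \delta^{(k)}_{M,N} : \mathbb{S}[\partial_{M,N}] \to [0, 1] $$

(that is, $\delta^{(k)}_{M,N}$ is a probability distribution on $\mathbb{S}[\partial_{M,N}]$), and
for every $d \in \mathbb{S}[\partial_{M,N}]$,

    $$ \delta^{(k)}_{M,N}(d) = \mathrm{Prob}(A^{(k)}(\mathcal{E}, M, N)[\partial_{M,N}] = d) . $$

Next, define the encoder $\mathcal{E}^{(k)}$ as

    $$ \mathcal{E}^{(k)} = (\Psi, \mu, \delta^{(k)}) . $$ (6)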



Lemma 2 ([3, Proposition 6.2]): The probability distribu-
tions of $A^{(k)}(\mathcal{E}, M, N)$ and $A(\mathcal{E}^{(k)}, M, N)$ are equal.

The next lemma essentially states that the normalized en-
tropies of $A$ and $A^{(k)}$ are asymptotically equal (for $M, N \to
\infty$ and $k$ fixed). The proof is straightforward.

Lemma 3: Fix an integer $k > 0$. Then,

    $$ R(\mathcal{E}) = R(\mathcal{E}^{(k)}) . $$

It follows from Lemma 3 that we can obtain bounds on
$R(\mathcal{E})$ by bounding instead the rate of the quasi-stationary
encoder $\mathcal{E}^{(k)}$. And, indeed, quasi-stationarity will turn out to
be useful for this purpose.

V. Linear program 

In this section, we present lower and upper bounds on $R(\mathcal{E})$.
The bounds will be expressed as values of corresponding linear
programs.

For $r, s > 0$ and $t \ge 0$, we say that the parallelogram $B^{(t)}_{r,s}$
is valid with respect to the neighbor set $\Psi$ if the set

    $$ \{ (\alpha, \beta) : (\Psi \cup \{(0,0)\}) \subseteq \sigma_{\alpha,\beta}(B^{(t)}_{r,s}) \} $$ (7)

is non-empty. Namely, some shift of the parallelogram in-
cludes the neighbor set and $(0,0)$. From here onward, we
fix $r$, $s$, and $t$ so that $B^{(t)}_{r,s}$ is valid. Also, we fix $u$ and $v$,
where $(u, v)$ is the largest element of (7), with respect to the
ordering $\prec$.
Fig. 4. The index sets $\Psi$, $\Lambda$, and $\Gamma$. The index sets are shown for $r = 4$,
$s = 5$, and for both $t = 0$ and $t = 1$. The index (0, 0) is represented by
•. We take $\Psi$ as in (1), and it is represented by the diagonally striped cells.
The index set $\Lambda$ is represented by the shaded part (both light and dark). The
boundary $\Gamma$ is represented by the lighter shaded part. Note that $\Psi \subseteq \Gamma \subseteq \Lambda$.
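The choice of $(u, v)$ and the resulting index sets of Fig. 4 can be computed mechanically. The sketch below enumerates the set (7) for the neighbor set in (1), picks its largest element under the lexicographic ordering (which coincides with Python's tuple ordering), and builds the shifted parallelogram together with its boundary, the sets that the caption of Fig. 4 calls $\Lambda$ and $\Gamma$. All function names are illustrative.

```python
from typing import List, Set, Tuple

Index = Tuple[int, int]

PSI: Set[Index] = {(0, -2), (0, -1), (-1, -1), (-1, 0), (-1, 1)}  # from (1)


def parallelogram(r: int, s: int, t: int) -> Set[Index]:
    """B^(t)_{r,s} = {(i, j) : 0 <= i < r, 0 <= t*i + j < s}."""
    return {(i, j) for i in range(r) for j in range(-t * i, s - t * i)}


def shift(index_set: Set[Index], alpha: int, beta: int) -> Set[Index]:
    return {(i + alpha, j + beta) for (i, j) in index_set}


def boundary(u: Set[Index], psi: Set[Index]) -> Set[Index]:
    return {(a, b) for (a, b) in u if not shift(psi, a, b) <= u}


def valid_shifts(psi: Set[Index], r: int, s: int, t: int) -> List[Index]:
    """Elements of the set (7). Since (0, 0) must be covered, every valid
    shift is the negation of some index of the unshifted parallelogram."""
    base = parallelogram(r, s, t)
    needed = set(psi) | {(0, 0)}
    return [(-i, -j) for (i, j) in base if needed <= shift(base, -i, -j)]


def lambda_gamma(psi: Set[Index], r: int, s: int, t: int):
    shifts = valid_shifts(psi, r, s, t)
    if not shifts:
        raise ValueError("parallelogram is not valid for this neighbor set")
    u, v = max(shifts)  # tuple comparison is lexicographic
    lam = shift(parallelogram(r, s, t), u, v)
    return (u, v), lam, boundary(lam, psi)


for t in (0, 1):
    (u, v), lam, gam = lambda_gamma(PSI, 4, 5, t)
    print(t, (u, v), len(lam), len(gam), PSI <= gam <= lam)
```

For $r = 4$, $s = 5$ this yields $(u,v) = (-1,-2)$ when $t = 0$ and $(u,v) = (-1,-1)$ when $t = 1$, and in both cases the neighbor set lies inside the boundary while $(0,0)$ lies in the interior.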



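Denote (see Figure 4)

    $$ \Lambda = \sigma_{u,v}(B^{(t)}_{r,s}) , \quad \Gamma = \partial(\Lambda, \Psi) . $$

For an as yet unspecified probability distribution

    $$ \pi : \mathbb{S}[\Gamma] \to [0, 1] , $$

define the random variable $Y$ taking values on $\mathbb{S}[\Lambda]$ as follows.
For $y \in \mathbb{S}[\Lambda]$,

    $$ \mathrm{Prob}(Y = y) = \pi(y[\Gamma]) \prod_{(i,j) \in \Lambda \setminus \Gamma} \mu(y_{i,j} \mid \tau_{i,j}(y, \Psi)) $$ (8)

(compare to (5)). Note that $\mathrm{Prob}(Y = y)$ is a linear function
of the various $\pi(z)$'s. Next, define

    $$ \Lambda' = \sigma_{u,v}(B^{(t)}_{r,s-1}) , \quad \Lambda'' = \sigma_{u,v}(B^{(t)}_{r-1,s}) , $$

and

    $$ \Gamma' = \partial(\Lambda', \Psi) , \quad \Gamma'' = \partial(\Lambda'', \Psi) . $$

Consider the linear program in Figure 5. First, note that
it is indeed a linear program. Namely, recall that by (8), the
probability distribution of $Y$ is a linear function of the $\pi(z)$'s.
Thus, both sides of (9) and (10) are also linear functions of
the $\pi(z)$'s. For example, the LHS of (9) equals

    $$ \sum_{y \in \mathbb{S}[\Lambda] \,:\, y[\Gamma'] = z'} \pi(y[\Gamma]) \prod_{(i,j) \in \Lambda \setminus \Gamma} \mu(y_{i,j} \mid \tau_{i,j}(y, \Psi)) . $$

Denote the value of the linear program when minimizing
by $\mathrm{lp}^*_{\min} = \mathrm{lp}^*_{\min}(\mathcal{E})$, and when maximizing by $\mathrm{lp}^*_{\max} =
\mathrm{lp}^*_{\max}(\mathcal{E})$. Since (5) and (8) are very similar, we may in-
tuitively say that $\mathcal{E}$ outputs $Y$. The optimization is over
the probability distribution of the boundary $Y[\Gamma]$. The linear
requirements (9) and (10) are added to force the distribution of
$Y$ to be stationary. The objective function is the rate at point
$(0,0)$.

    Minimize (Maximize)

        $$ - \sum_{z \in \mathbb{S}[\Gamma]} \pi(z) \sum_{w \in \Sigma} \mu(w \mid z[\Psi]) \log_2 \mu(w \mid z[\Psi]) $$

    over the variables $(\pi(z) : z \in \mathbb{S}[\Gamma])$, subject to the following:

        $$ \sum_{z \in \mathbb{S}[\Gamma]} \pi(z) = 1 . $$

    For all $z \in \mathbb{S}[\Gamma]$,

        $$ \pi(z) \ge 0 . $$

    For all $z' \in \mathbb{S}[\Gamma']$,

        $$ \mathrm{Prob}(Y[\Gamma'] = z') = \mathrm{Prob}(Y[\sigma_{0,1}(\Gamma')] = \sigma_{0,1}(z')) . $$ (9)

    For all $z'' \in \mathbb{S}[\Gamma'']$,

        $$ \mathrm{Prob}(Y[\Gamma''] = z'') = \mathrm{Prob}(Y[\sigma_{1,-t}(\Gamma'')] = \sigma_{1,-t}(z'')) . $$ (10)

Fig. 5. Linear program. The minimum (maximum) value is denoted $\mathrm{lp}^*_{\min}$
($\mathrm{lp}^*_{\max}$) and is a lower (upper) bound on $R(\mathcal{E})$.

The following theorem is our main result.

Theorem 4: For the linear program in Figure 5 we have
that

    $$ \mathrm{lp}^*_{\min} \le R(\mathcal{E}) \le \mathrm{lp}^*_{\max} . $$

In order to prove the theorem, we first state and prove a
lemma, on a slightly modified linear program.

Lemma 5: Fix $k > 0$, and replace (9) and (10) in Figure 5
by

    $$ \left| \mathrm{Prob}(Y[\Gamma'] = z') - \mathrm{Prob}(Y[\sigma_{0,1}(\Gamma')] = \sigma_{0,1}(z')) \right| \le \frac{1}{k} $$

and

    $$ \left| \mathrm{Prob}(Y[\Gamma''] = z'') - \mathrm{Prob}(Y[\sigma_{1,-t}(\Gamma'')] = \sigma_{1,-t}(z'')) \right| \le \frac{t + 1}{k} , $$

respectively. Denote the minimum and maximum of the resulting linear
program as $\mathrm{lp}^{(k)}_{\min}$ and $\mathrm{lp}^{(k)}_{\max}$, respectively. Then,

    $$ \mathrm{lp}^{(k)}_{\min} \le R(\mathcal{E}) \le \mathrm{lp}^{(k)}_{\max} . $$

Proof: Consider $\mathcal{E}^{(k)}$ (as defined by (6)). For given $M$
and $N$, define the index sets

    $$ B = \partial(B_{M,N}, \Lambda) , \quad U = \bar{\partial}(B_{M,N}, \Lambda) . $$

Obviously,

    $$ \lim_{M,N \to \infty} \frac{|U|}{M \cdot N} = 1 . $$ (11)

Denote $A^{(k)} = A^{(k)}(\mathcal{E}, M, N)$. By (11) and Lemma 3,

    $$ R(\mathcal{E}) = \liminf_{M,N \to \infty} \frac{H(A^{(k)}[U] \mid A^{(k)}[B])}{|U|} . $$

Notice that $\Psi \subseteq \Lambda$. Thus, $U \subseteq \bar{\partial}_{M,N}$, and we have

    $$ H(A^{(k)}[U] \mid A^{(k)}[B]) = \sum_{(i,j) \in U} H(A^{(k)}_{i,j} \mid A^{(k)}[T_{i,j} \cap B_{M,N}])
       = \sum_{(i,j) \in U} H(A^{(k)}_{i,j} \mid \tau_{i,j}(A^{(k)}, \Psi)) , $$

where $T_{i,j}$ is as defined in (2) and the last equality follows
from (5).

We now prove the following claim: for all $(i,j) \in U$, we
have that

    $$ \mathrm{lp}^{(k)}_{\min} \le H(A^{(k)}_{i,j} \mid \tau_{i,j}(A^{(k)}, \Psi)) . $$ (12)

To see this, fix some $(i,j) \in U$, and define for all $z \in \mathbb{S}[\Gamma]$,

    $$ p^{(k)}(z) = \mathrm{Prob}(\tau_{i,j}(A^{(k)}, \Gamma) = z) . $$

Substituting $\pi(z) = p^{(k)}(z)$, the objective function
is equal to $H(A^{(k)}_{i,j} \mid \tau_{i,j}(A^{(k)}, \Psi))$. Also, notice that the proba-
bility distribution of $Y$ is equal to that of $\tau_{i,j}(A^{(k)}, \Lambda)$. By the
fact that $A^{(k)}$ is quasi-stationary (and thus, so is every sub-
configuration of it), all the linear requirements in the modified
linear program are satisfied (i.e., the $p^{(k)}(z)$'s form a feasible
solution). So, our claim (12) is proved.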



TABLE I

Bounds on the rates of encoders using a small number of coins.

Constraint     Coins   lp*_min     lp*_max     [3]
(2,∞)-RLL      1       0.440722    0.444679    0.4267
(3,∞)-RLL      1       0.349086    0.386584    0.3402
n.i.b.         2       0.917730    0.919395    0.9127
(1,∞)-RLL      3       0.587776    0.587785


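We conclude that $\mathrm{lp}^{(k)}_{\min} \le R(\mathcal{E}^{(k)})$. Thus, by Lemma 3,

    $$ \mathrm{lp}^{(k)}_{\min} \le R(\mathcal{E}) . $$

A similar proof yields $R(\mathcal{E}) \le \mathrm{lp}^{(k)}_{\max}$. ■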
Proof of Theorem 4: First, note that the modified linear
program defined in Lemma 5 has at least one feasible solution,
whenever $M$ and $N$ are large enough so that $U$ is
non-empty.

For a given $k$, denote the minimizing variable values of the
modified linear program by $\pi^{(k)}(z)$, $z \in \mathbb{S}[\Gamma]$. Think of these
variable values as a vector

    $$ \pi^{(k)} = (\pi^{(k)}(z))_{z \in \mathbb{S}[\Gamma]} . $$

By compactness, the sequence $\pi^{(k)}$, $k = 1, 2, \ldots$, has a cluster
point, which we denote by $\pi^*$. Obviously, $\pi^*$ implies a
feasible solution for the linear program in Figure 5. Moreover,
we must also have that the value of the objective function for
this feasible solution is a lower bound on $R(\mathcal{E})$. So,

    $$ \mathrm{lp}^*_{\min} \le R(\mathcal{E}) . $$

Similarly, we deduce that

    $$ R(\mathcal{E}) \le \mathrm{lp}^*_{\max} . $$

■
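To see concretely why the objective of the linear program is linear in the variables, consider the running example. Every context other than $\varphi^{(0)}$ and $\varphi^{(1)}$ forces a "0" and contributes no entropy, so the objective reduces to a weighted sum of two binary entropies, with weights equal to the total probability that $\pi$ assigns to boundary configurations $z$ with $z[\Psi] = \varphi^{(0)}$ or $z[\Psi] = \varphi^{(1)}$. The sketch below evaluates this sum; the context weights in the usage line are purely illustrative and are not a solution of the linear program.

```python
from math import log2
from typing import Dict


def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a coin that shows "1" with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


# Coin biases from (4), keyed by hypothetical labels for the two contexts.
COIN_BIAS: Dict[str, float] = {"phi0": 0.258132, "phi1": 0.312231}


def objective(context_prob: Dict[str, float]) -> float:
    """Weighted sum of per-context entropies.

    context_prob maps "phi0"/"phi1" to Prob(Y[Psi] = phi); all remaining
    contexts have a deterministic output and contribute zero.
    """
    return sum(context_prob.get(label, 0.0) * binary_entropy(bias)
               for label, bias in COIN_BIAS.items())


# Illustrative weights only (assumed, not derived from the linear program).
print(objective({"phi0": 0.4, "phi1": 0.1}))
```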

Remark: While the definition of the encoder $\mathcal{E}$ includes
(besides $\Psi$ and $\mu$) also the boundary distributions $\delta =
(\delta_{M,N})_{M,N > 0}$, the bounds $\mathrm{lp}^*_{\min}$ and $\mathrm{lp}^*_{\max}$ do not depend
on $\delta$.

Applying Theorem 4 to our running example, with $r = 4$,
$s = 5$, $t = 1$, gives

    $$ 0.42430953 \le R(\mathcal{E}) \le 0.42442765 . $$

To the best of our knowledge, our running example is the
highest-rate bit-stuffing encoder known, given that we are allowed
to use at most two coins (i.e., two probability transformers).
For comparison, we have calculated by the method presented
in [6] that

    $$ \mathrm{cap}(\mathbb{S}_{\mathrm{sq}}) \le 0.425078 . $$

Namely, with two coins we achieve a rate that is only 0.2%
less than capacity.
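The 0.2% figure can be checked directly from the two numbers quoted above. Since the rate is at least 0.42430953 and the capacity is at most 0.425078, the relative gap between rate and capacity is bounded from above by the quantity computed in this short sketch (variable names are illustrative).

```python
# Values quoted in the text.
rate_lower_bound = 0.42430953      # lower bound on R(E) from Theorem 4
capacity_upper_bound = 0.425078    # upper bound on cap(S_sq)

# 1 - R/cap <= 1 - rate_lower_bound / capacity_upper_bound
relative_gap = (capacity_upper_bound - rate_lower_bound) / capacity_upper_bound
print(f"{100 * relative_gap:.2f}%")  # prints: 0.18%
```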

Table I contains our results for a number of constraints. We
abbreviate the "no isolated bits" constraint as "n.i.b.". In the
first three rows, we compare our bounds to the results in [3]
(Table 1 and Equation (12)). For the comparison to be fair,
we restrict ourselves to the neighbor sets $\Psi$ used in [3], and
use the same number of coins.



TABLE II

Bounds on the rates of certain bit-stuffing encoders.

Constraint     Coins   lp*_min    lp*_max   [7]        Others
(2,∞)-RLL      5       0.44420    0.4450    0.44417    0.4423
(3,∞)-RLL      2       0.35973    0.3690    0.36562    0.3641
(0,2)-RLL      66      0.81549    0.8169    0.81600    0.7736
               18      0.81501    0.8162
               9       0.81073    0.8197
n.i.b.         56      0.92264    0.9238    0.92086    0.9156



VI. A lower bound on capacity

The following is a straightforward corollary of Theorem 4.

Corollary 6: For every bit-stuffing encoder $\mathcal{E}$,

    $$ \mathrm{lp}^*_{\min}(\mathcal{E}) \le \mathrm{cap}(\mathbb{S}) . $$

Thus, we can use the minimizing linear program of Figure 5
to bound $\mathrm{cap}(\mathbb{S})$ from below.

To obtain better lower bounds on $\mathrm{cap}(\mathbb{S})$, we can search for
good $\Psi$ and $\mu$. For instance, for the set $\Psi = \Psi_{\mathrm{sq}}$ in (1), the
function $\mu_{\mathrm{sq}}$ in (4) was obtained by maximizing $\mathrm{lp}^*_{\min}$ over all
$\mu$ that form with $\Psi_{\mathrm{sq}}$ (and every $\delta$) a bit-stuffing encoder for
$\mathbb{S}_{\mathrm{sq}}$. Better lower bounds can be obtained by looking at larger
sets $\Psi$ (at the price of higher computational complexity).

Table II summarizes our results for certain constraints. The
last two columns contain previously published lower bounds
on the capacity of the corresponding constraint. We have
highlighted values of $\mathrm{lp}^*_{\min}$ which are an improvement on
these previously known results. The bounds in the penultimate
column are taken from [7], which was published recently. We
note that the method used in [7] is quite different from ours.
As can be seen, the results of [7] and of our method are comparable. The
bounds in the last column are taken from [8], [5], [9], and [10],
respectively: they were the best known when our method
was first published in [11] (at the same time as [7]).
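The comparison described above can be recomputed from the entries of Table II. The sketch below takes, for each constraint, the first listed value of $\mathrm{lp}^*_{\min}$ and checks it against the two previously published bounds. Treating "improvement" as exceeding a given column is an assumption made here for illustration; the variable names are likewise illustrative.

```python
# (constraint, coins, lp*_min, bound from [7], bound from "Others"), from Table II.
rows = [
    ("(2,inf)-RLL", 5, 0.44420, 0.44417, 0.4423),
    ("(3,inf)-RLL", 2, 0.35973, 0.36562, 0.3641),
    ("(0,2)-RLL", 66, 0.81549, 0.81600, 0.7736),
    ("n.i.b.", 56, 0.92264, 0.92086, 0.9156),
]

# Constraints where lp*_min exceeds the bound in the last column.
beats_others = [name for (name, _, lp_min, _, others) in rows if lp_min > others]

# Constraints where lp*_min exceeds both previously published bounds.
beats_both = [name for (name, _, lp_min, ref7, others) in rows
              if lp_min > max(ref7, others)]

print(beats_others)  # ['(2,inf)-RLL', '(0,2)-RLL', 'n.i.b.']
print(beats_both)    # ['(2,inf)-RLL', 'n.i.b.']
```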